FRECLE Mining: Discovering Frequent Semantic Tree Cluster Sequences from Historical Tree Structured Data

Authors

Ling Chen

Sourav S Bhowmick

Abstract

Mining frequent trees is useful in domains such as bioinformatics, web mining, and mining semistructured data. Existing techniques focus on finding “structural” patterns and ignore the “semantics” that may be associated with the subtrees. In this paper we propose an algorithm to mine a novel pattern called frequent semantic tree cluster sequences (FRECLE), which captures the frequent sequential association between different semantics of tree-structured data. Given a semantic tree sequence database, the algorithm first assigns each semantic tree to a semantic cluster. Next, FRECLE patterns are discovered from the semantic cluster sequences by adopting an existing frequent sequential pattern mining algorithm. FRECLE patterns are beneficial in applications where knowledge of semantic association is significant, such as XML query caching, prefetching XML data, and web user clustering. Specifically, we show how our proposed FRECLE mining framework can be used to design an optimal XML query cache replacement strategy. Finally, by reporting the performance of our algorithm and caching strategy through extensive experiments with both synthetic and real datasets, we show the effectiveness and usefulness of FRECLE mining.
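The second stage described in the abstract, mining frequent patterns from semantic cluster sequences, can be illustrated with a minimal sketch. It assumes the first stage is already done, so every semantic tree has been replaced by a cluster id. The function name `frequent_cluster_sequences`, the parameters `min_support` and `max_len`, and the sample data are all hypothetical. The brute-force subsequence enumeration is a stand-in for the "existing frequent sequential pattern mining algorithm" the abstract mentions, which it does not name.

```python
from collections import Counter
from itertools import combinations


def frequent_cluster_sequences(sequences, min_support, max_len=3):
    """Return ordered subsequences of cluster ids found in at least
    `min_support` input sequences (brute force, illustration only)."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        # Enumerate every ordered, not necessarily contiguous, subsequence
        # of length 1..max_len by choosing increasing index tuples.
        for k in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), k):
                seen.add(tuple(seq[i] for i in idx))
        # A sequence contributes at most once to each pattern's support.
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical input: each semantic tree already mapped to a cluster id.
cluster_sequences = [
    ["A", "B", "C"],
    ["A", "C"],
    ["B", "A", "C"],
]
patterns = frequent_cluster_sequences(cluster_sequences, min_support=3)
```

With these three sample sequences and a support threshold of 3, only the patterns `("A",)`, `("C",)`, and `("A", "C")` survive. Enumerating all subsequences grows combinatorially with sequence length, so a real implementation would use a pruning-based sequential pattern miner instead.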


Similar resources

Mining of Users’ Access Behaviour for Frequent Sequential Pattern from Web Logs

Sequential Pattern mining is the process of applying data mining techniques to a sequential database for the purposes of discovering the correlation relationships that exist among an ordered list of events. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Discovering hidden information fro...


Efficient Substructure Discovery from Large Semi-structured Data

With the rapid progress of network and storage technologies, a huge amount of electronic data such as Web pages and XML data [23] has become available on intranets and the Internet. These electronic data are a heterogeneous collection of ill-structured data that have no rigid structures, and are often called semi-structured data [1]. Hence, there have been increasing demands for automatic methods for extracting usef...


Discovering Minimal Infrequent Structures from XML Documents

More and more data (documents) are wrapped in XML format. Mining these documents involves mining the corresponding XML structures. However, the semi-structured (tree structured) XML makes it somewhat difficult for traditional data mining algorithms to work properly. Recently, several new algorithms were proposed to mine XML documents. These algorithms mainly focus on mining frequent tree struct...


Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...



Publication date: 2007
